Syntactic Translation Patterns from a Parallel Treebank

Author

  • Mihaela Colhon
Abstract

The goal of the presented parallel phrase extraction algorithm is to provide a rich and robust set of syntactic translation patterns. To make this approach feasible, we consider the phrase-to-phrase alignments of a bilingual treebank annotated with syntactic constituents. For this purpose, the extracted phrasal nodes are encoded by the syntactic information of their components, highlighting special constructs such as function words.
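
The encoding step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the `Node` class, the `FUNCTION_TAGS` set (Penn-style tags chosen only for illustration), the `encode` and `extract_patterns` helpers, and the toy English-French pair are all assumptions, since the abstract specifies neither a tag set nor a pattern format. The sketch encodes each aligned phrasal node by the categories of its children, keeps function words lexicalized, and counts the resulting source-target pattern pairs.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical set of function-word tags; the abstract does not name one.
FUNCTION_TAGS = {"DT", "IN", "CC", "TO"}


@dataclass
class Node:
    label: str                                  # constituent label or POS tag
    word: Optional[str] = None                  # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)


def encode(node: Node) -> str:
    """Encode a phrasal node by the categories of its children.

    Function-word leaves keep their surface form, e.g. 'DT[the]';
    every other child is reduced to its label.
    """
    parts = []
    for child in node.children:
        if child.word is not None and child.label in FUNCTION_TAGS:
            parts.append(f"{child.label}[{child.word.lower()}]")
        else:
            parts.append(child.label)
    return f"{node.label} -> {' '.join(parts)}"


def extract_patterns(alignments: List[Tuple[Node, Node]]) -> Counter:
    """Count (source pattern, target pattern) pairs over aligned phrasal nodes."""
    return Counter((encode(src), encode(tgt)) for src, tgt in alignments)


# Toy aligned pair (invented for illustration): "the house" <-> "la maison".
en = Node("NP", children=[Node("DT", "the"), Node("NN", "house")])
fr = Node("NP", children=[Node("DT", "la"), Node("NN", "maison")])
patterns = extract_patterns([(en, fr)])
```

Here `patterns` maps the pair `("NP -> DT[the] NN", "NP -> DT[la] NN")` to a count of 1. In a real treebank the alignment list would come from the phrase-to-phrase links of the annotated corpus.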

Similar resources

Building A Case-based Semantic English-Chinese Parallel Treebank

We construct a case-based English-to-Chinese semantic constituent parallel Treebank for a Statistical Machine Translation (SMT) task by labelling each node of the Deep Syntactic Tree (DST) with our refined semantic cases. Since subtree span-crossing is harmful in tree-based SMT, DST is adopted to alleviate this problem. At the same time, we tailor an existing case set to represent bili...

Aligning Chinese-English Parallel Parse Trees: Is it Feasible?

We investigate the feasibility of aligning Chinese and English parse trees by examining cases of incompatibility between Chinese-English parallel parse trees. This work is done in the context of an annotation project where we construct a parallel treebank by doing word and phrase alignments simultaneously. We discuss the most common incompatibility patterns identified within VPs and NPs and show ...

Automatic Phrase Alignment: Using Statistical N-gram Alignment for Syntactic Phrase Alignment

A parallel treebank consists of syntactically annotated sentences in two or more languages, taken from translated (i.e. parallel) documents. These parallel sentences are linked through alignment. Much work has been done on sentence and word alignment, but not as much on the intermediate level. This paper explores using n-gram alignment created for statistical machine translation based on GIZA++...

LinES: An English-Swedish Parallel Treebank

This paper presents an English-Swedish Parallel Treebank, LinES, that is currently under development. LinES is intended as a resource for the study of variation in translation of common syntactic constructions from English to Swedish. For this reason, annotation in LinES is syntactically oriented, multi-level, complete and manually reviewed according to guidelines. Another aim of LinES is to su...

Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions

We present Poly-GrETEL, an online tool which enables syntactic querying in parallel treebanks and which is based on the monolingual GrETEL environment. We provide online access to the Europarl parallel treebank for Dutch and English, allowing users to query the treebank using either an XPath expression or an example sentence in order to look for similar constructions. We provide automatic align...

Publication date: 2012